A Survey of Methods for Scaling Up Inductive Learning Algorithms

Authors

  • Foster J. Provost
  • Venkateswarlu Kolluri
Abstract

Each year, one of the explicit challenges for the KDD research community is to develop methods that facilitate the use of inductive learning algorithms for mining very large databases. By collecting, categorizing, and summarizing past work on scaling up inductive learning algorithms, this paper serves to establish a common ground for researchers addressing the challenge. We begin with a discussion of important, but often tacit, issues related to scaling up learning algorithms. We highlight similarities among methods by categorizing them into three main approaches. For each approach, we then describe, compare, and contrast the different constituent methods, drawing on specific examples from the published literature. Finally, we use the preceding analysis to suggest how one should proceed when dealing with a large problem, and where future research efforts should be focused.

Primary contact: Foster Provost, NYNEX Science and Technology, 400 Westchester Avenue, White Plains, NY 10604. Email: [email protected]. Phone: 914 644 2169. Fax: 914 949 9566.

Similar resources

A Survey of Methods for Scaling Up

One of the defining challenges for the KDD research community is to enable inductive learning algorithms to mine very large databases. By collecting, categorizing, and summarizing existing work on scaling up inductive algorithms, this paper serves to establish common ground for researchers addressing the challenge. We concentrate on algorithms that build decision trees and rule sets, in order to...

A Survey of Methods for Scaling Up Inductive Algorithms

One of the defining challenges for the KDD research community is to enable inductive learning algorithms to mine very large databases. This paper summarizes, categorizes, and compares existing work on scaling up inductive algorithms. We concentrate on algorithms that build decision trees and rule sets, in order to provide focus and specific details; the issues and techniques generalize to other ...

Scaling Up Inductive Algorithms: An Overview

This paper establishes common ground for researchers addressing the challenge of scaling up inductive data mining algorithms to very large databases, and for practitioners who want to understand the state of the art. We begin with a discussion of important, but often tacit, issues related to scaling up. We then overview existing methods, categorizing them into three main approaches. Finally, we...

Scaling Up Inductive Learning with Massive Parallelism

Machine learning programs need to scale up to very large data sets for several reasons, including increasing accuracy and discovering infrequent special cases. Current inductive learners perform well with hundreds or thousands of training examples, but in some cases, up to a million or more examples may be necessary to learn important special cases with confidence. These tasks are infeasible for...

Scaling up Inductive Logic Programming by Learning from Interpretations

When comparing inductive logic programming (ILP) and attribute-value learning techniques, there is a trade-off between expressive power and efficiency. Inductive logic programming techniques are typically more expressive but also less efficient. Therefore, the data sets handled by current inductive logic programming systems are small according to general standards within the data mining community. T...

Publication date: 1997